Performance tuning for vec search #186

Open
arkohut opened this issue Jan 23, 2025 · 1 comment

arkohut commented Jan 23, 2025

Hi, I’ve been using sqlite-vec to build my personal screen retrieval project, Pensieve, but I’ve encountered some performance-related issues recently.

Currently, I’m running this project on an M1 Max MacBook, with approximately 250,000 records in the database. Vector searches have become very slow. Here’s a rough summary of the situation:

[Image: screenshot attached to the original issue, not preserved here]

I’m not sure whether this is a normal level of performance. Below are my table structure and the corresponding SQL:

  CREATE VIRTUAL TABLE entities_vec_v2 USING vec0(
      embedding float[1024] distance_metric=cosine,
      file_type_group text,
      created_at_timestamp integer,
      file_created_at_timestamp integer,
      file_created_at_date text partition key,
      app_name text,
      library_id integer
  )

  SELECT rowid
    FROM entities_vec_v2
    WHERE embedding MATCH ?
      AND file_type_group = 'image'
      AND K = 48
      AND library_id IN ?
    ORDER BY distance ASC

@brankoradovanovic-mcom

No, this is not normal performance. I've tried sqlite-vec with 400,000 rows (embedding size=768), and a query like this runs in less than a second on a warmed-up database (i.e. cached, without I/O).
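
To compare numbers like these fairly, it helps to separate the first (possibly cold, I/O-bound) run from later warmed-up runs. Below is a minimal sketch of such a measurement in Python, not anything taken from sqlite-vec itself. The helper name `time_query` and the stand-in `demo` table are assumptions made for illustration; for a real measurement you would open the actual database with the sqlite-vec extension loaded and pass in the KNN query from the issue.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, params=(), repeats=5):
    """Hypothetical helper: run `sql` once, then `repeats` more times.

    Returns (first_run_seconds, median_of_later_runs_seconds). The first
    run may include disk I/O; later runs are more likely to be served
    from cache, which approximates a "warmed-up" database.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    timings = []
    for _ in range(repeats + 1):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()  # fetchall forces full evaluation
        timings.append(time.perf_counter() - start)
    return timings[0], statistics.median(timings[1:])


if __name__ == "__main__":
    # Stand-in table (an assumption) so the sketch runs without sqlite-vec.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, library_id INTEGER)")
    conn.executemany(
        "INSERT INTO demo (library_id) VALUES (?)",
        [(i % 3,) for i in range(10_000)],
    )
    first, warm = time_query(conn, "SELECT id FROM demo WHERE library_id = ?", (1,))
    print(f"first run: {first * 1000:.2f} ms, warm median: {warm * 1000:.2f} ms")
```

If the warm median is still far above one second at this data size, the slowdown is probably not caused by disk I/O alone, and the query shape or the filters are worth a closer look.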
